Genetic sub-structuring of Croatian island populations in the Southeastern European context: a meta-analysis

Aim To use the method of meta-analysis to assess the influence of island population isolation on the sub-structuring of the Croatian population, as well as the influence of regional population groups on the sub-structuring of the Southeastern European population with regard to basic population genetic statistical parameters calculated by using STR locus analysis.
Methods Bio-statistical analyses were performed for 2877 unrelated participants of both sexes from Southeastern Europe. Nine autosomal STR loci (D3S1358, vWA, FGA, TH01, TPOX, CSF1PO, D5S818, D13S317, and D7S820) were analyzed by using standard F-statistics and population structure analysis (Structure software).
Results Genetic differentiation of Croatian subpopulations assessed with the FST method was higher at the level of the Croatian population (0.005) than at the level of Southeastern Europe (0.002). The island of Vis showed the most pronounced separation in the Croatian population, and Albanians from Kosovo in the population of Southeastern Europe, followed by Croatia, Bosnia and Herzegovina, and Hungary.
Conclusion The higher structure of Croatian subpopulations in relation to Southeastern Europe suggests a certain degree of genetic isolation, most likely due to the influence of endogamy within rural island populations.

The island populations of the eastern Adriatic have been the subject of multidisciplinary anthropological research for almost 50 years, starting with the pioneering work of Rudan et al in 1972 (1). A number of specific features of these rural populations have been revealed, which make them exceptional models for studying the ethno-cultural, historical, migratory, and demographic characteristics of this region. More specifically, evolutionary forces (bottleneck effect and genetic drift) increase genome homogeneity within the genetic structure of such island isolates by eliminating certain genetic traits in favor of others and increasing the likelihood of finding low-impact alleles (2,3). The reduced genetic and environmental diversity makes genetically isolated populations suitable for the study of different complex and rare Mendelian hereditary diseases, since the combined action of genetic drift, inbreeding, and founder effect increases the prevalence of such diseases when compared with the general population.
Southeastern Europe was one of Europe's glacial refugia during the ice age, and the origin of postglacial resettlement of Europe in the Paleolithic and Neolithic. Due to this specific role and its position at the crossroads of migrations to and from Europe, this area was extensively investigated in the field of population genetics (4-7). Different genetic markers have been used to investigate the genetic landscape of Europe and determine the patterns of population sub-structuring at the regional and continental level (8). As a part of the comprehensive anthropological research on the population structure of Croatian island isolates, microsatellite DNA from different subpopulations has been previously analyzed to determine genetic diversity, population structure, and the degree of isolation of island populations (9). Similar studies were also conducted on a representative sample of the general Croatian population and other isolated populations from Southeastern Europe (10-12).
This study represents a continuation of previous anthropogenetic research (6,13-16). We used statistical and analytical methods of meta-analysis to synthesize data from previously conducted, mutually independent studies of island and continental populations of Croatia and Southeastern Europe (Figure 1) based on analyses of autosomal STR markers, and data analyzed in this study for the first time.
The aim of this study was to determine the genetic characteristics of populations from Southeastern Europe, with special reference to Croatian island populations, and to investigate the effect of specific intrapopulation genetic structure on interpopulation relationships. Namely, a specific aim was to investigate the influence of island population isolation on the sub-structuring of the Croatian population, and the influence of regional population groups on the sub-structuring of Southeastern Europe with regard to basic population genetic statistical parameters calculated by using STR locus analysis.

MATERIAL AND METHODS

Sample
The samples used were the same as described in a previous article by our research group (17). Certain analyses of autosomal STR markers were conducted for the first time in this study and some are from previous research performed by various authors (9,14-16,18-28). This article integrates all these studies using the statistical and analytical method of meta-analysis, as defined by Rosenbald (29).

STR marker analysis
STR marker analysis was performed by the salting-out method (30), as described in a previous article by our research group (17).
The standard F-statistics (31), which describe the level of inbreeding within a subpopulation (FIS), between subpopulations (FST), and within the total population (FIT), were used as a measure of correlation between alleles. Clustering was performed with the Ward hierarchical method (32,33) and the results were presented as dendrograms created with Statistica 9 (StatSoft, TIBCO Software, Dell, Round Rock, TX, USA).
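The relationship among the three fixation indices can be illustrated with a minimal sketch based on the textbook heterozygosity formulation (FIS = 1 - HI/HS, FST = 1 - HS/HT, FIT = 1 - HI/HT). This is not the estimator used in the study; the function names, the equal weighting of subpopulations, and the two-allele toy frequencies are assumptions made only for illustration.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def f_statistics(subpop_freqs, observed_het):
    """Return (F_IS, F_ST, F_IT) for one locus.

    subpop_freqs: list of allele-frequency lists, one per subpopulation.
    observed_het: observed heterozygosity in each subpopulation.
    Assumption: subpopulations are weighted equally (no sample-size correction).
    """
    n = len(subpop_freqs)
    # H_S: mean expected heterozygosity within subpopulations
    h_s = sum(expected_heterozygosity(f) for f in subpop_freqs) / n
    # H_T: expected heterozygosity of the pooled (total) population
    n_alleles = len(subpop_freqs[0])
    pooled = [sum(f[a] for f in subpop_freqs) / n for a in range(n_alleles)]
    h_t = expected_heterozygosity(pooled)
    # H_I: mean observed heterozygosity within subpopulations
    h_i = sum(observed_het) / n
    return 1.0 - h_i / h_s, 1.0 - h_s / h_t, 1.0 - h_i / h_t


# Hypothetical toy data: two subpopulations, one biallelic locus,
# observed heterozygosity equal to Hardy-Weinberg expectation.
f_is, f_st, f_it = f_statistics([[0.6, 0.4], [0.4, 0.6]], [0.48, 0.48])
```

With these toy values FIS is zero and FST equals FIT, and the three indices satisfy the identity (1 - FIT) = (1 - FIS)(1 - FST).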
To determine whether there are significant differences in allelic frequencies, ie, population structure, in the entire population of isolated island subpopulations in Croatia and among the populations of Southeastern Europe, a population structure analysis was performed (34) by using the program Structure 2.3.3 (Stanford University, Stanford, CA, USA). The limitation of low levels of population differentiation (FST<0.02) was overcome by using the sampling location parameter (35). The "LocPrior" option of the program settings, which enables the determination of structure at lower levels of separation, was used (36). Values of K from 1 to 10 were tested, and each run was repeated 10 times. To determine the most reliable K value, ΔK was calculated based on the rate of change of lnP(D) between successive K values. In order to determine ΔK, the results of all analyses were processed in Structure Harvester v. 0.6 (37,38). The analysis conducted at the level of the Croatian population included a sample of all island subpopulations (n = 733), excluding the mainland population. The analysis conducted at the level of Southeastern Europe included a sample of Southeastern European populations (n = 1805). The Croatian sample analyzed in the broader Southeastern European context was reduced from n = 1230 to n = 158 to mirror the proportions of these populations in the actual sample of Croatia and to avoid sample bias (39).
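The reduction of the Croatian sample from n = 1230 to n = 158 can be pictured as a stratified random draw. The sketch below assumes proportional allocation with largest-remainder rounding, a scheme the article does not specify. The function names are hypothetical, and the two strata shown (mainland n = 497, islands n = 733) are simply the only sizes given in the text; the actual procedure may have treated each subpopulation separately.

```python
import random


def proportional_allocation(stratum_sizes, target):
    """Split `target` across strata in proportion to their sizes.

    Uses integer arithmetic and largest-remainder rounding so that
    the allocations always sum exactly to `target`.
    """
    total = sum(stratum_sizes.values())
    alloc = {k: n * target // total for k, n in stratum_sizes.items()}
    leftover = target - sum(alloc.values())
    # Give the remaining units to the strata with the largest remainders.
    by_remainder = sorted(
        stratum_sizes,
        key=lambda k: (stratum_sizes[k] * target) % total,
        reverse=True,
    )
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


def stratified_subsample(strata, target, seed=0):
    """Draw a random subsample from each stratum (dict: name -> list of IDs)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    sizes = {k: len(v) for k, v in strata.items()}
    alloc = proportional_allocation(sizes, target)
    return {k: rng.sample(v, alloc[k]) for k, v in strata.items()}


# Placeholder participant IDs; only the stratum sizes come from the article.
strata = {"mainland": list(range(497)), "islands": list(range(497, 1230))}
allocation = proportional_allocation({k: len(v) for k, v in strata.items()}, 158)
subsample = stratified_subsample(strata, 158)
```

Under these assumptions the allocation is 64 mainland and 94 island participants, which preserves the mainland-to-island ratio of the full Croatian sample.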

Genetic differentiation of Croatian subpopulations
The total genetic differentiation coefficients (FST) for the compared population pairs (Table 1) and for each analyzed locus (Table 2) were estimated to assess the genetic distances between the analyzed Croatian subpopulations.
Most of the analyzed pairs showed a relatively small but significant level of genetic differentiation. The only exceptions were two analyzed pairs, mainland-island of Brač and island of Krk-North Dalmatian islands (NDI), with no significant difference at any locus. The lowest degree of total genetic differentiation was found between the mainland
The grouping results for Croatian subpopulations based on FST genetic distances, obtained by the Ward method, are presented as a dendrogram (Figure 2). Four main clusters are visible. The first one includes the populations of the mainland and the islands of Brač, Krk, and NDI. The second cluster includes the islands of Hvar and Korčula. The islands of Cres and Vis can be considered the third and the fourth cluster, respectively. These two islands also have the smallest populations, which contributed to the reduced genetic diversity within the islands and to their greater distance from the other analyzed subpopulations. The population of Vis was distant from all the analyzed populations, and the most distant from Hvar.
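How Ward clustering builds a dendrogram from a matrix of pairwise distances can be sketched with the Lance-Williams update on squared distances. This is a minimal pure-Python illustration, not the Statistica procedure used in the study. The labels P1 to P4 and the distance values are hypothetical, and Ward's criterion formally assumes Euclidean distances, so applying it to FST values is itself an approximation.

```python
def ward_merges(labels, dist):
    """Agglomerative clustering with Ward's criterion.

    labels: list of population names.
    dist:   symmetric matrix (list of lists) of pairwise distances.
    Returns the merge history as (members_a, members_b, merge_height).
    """
    clusters = {i: (lab,) for i, lab in enumerate(labels)}
    # Work on squared distances between currently active clusters.
    d2 = {(i, j): dist[i][j] ** 2
          for i in range(len(labels)) for j in range(i + 1, len(labels))}
    merges = []
    next_id = len(labels)
    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest squared distance.
        (i, j), dij = min(d2.items(), key=lambda item: item[1])
        ni, nj = len(clusters[i]), len(clusters[j])
        updated = {}
        for k, members in clusters.items():
            if k in (i, j):
                continue
            nk = len(members)
            dik = d2[(min(i, k), max(i, k))]
            djk = d2[(min(j, k), max(j, k))]
            # Lance-Williams update for Ward's method.
            updated[(k, next_id)] = (
                (ni + nk) * dik + (nj + nk) * djk - nk * dij
            ) / (ni + nj + nk)
        merges.append((clusters[i], clusters[j], dij ** 0.5))
        d2 = {pair: v for pair, v in d2.items()
              if i not in pair and j not in pair}
        d2.update(updated)
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        next_id += 1
    return merges


# Hypothetical distances: P1/P2 are close, P3/P4 are close, the groups are far apart.
toy_dist = [
    [0.000, 0.001, 0.010, 0.010],
    [0.001, 0.000, 0.010, 0.010],
    [0.010, 0.010, 0.000, 0.002],
    [0.010, 0.010, 0.002, 0.000],
]
merges = ward_merges(["P1", "P2", "P3", "P4"], toy_dist)
```

In this toy example the two close pairs are joined first and the two resulting groups are joined last, which is the pattern a dendrogram would display as two main clusters.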

Genetic differentiation of the populations of Southeastern Europe
Total values of FST (Table 3) and the values of the genetic differentiation coefficient at each individual locus (Table 4) were estimated for each pair of populations to determine the kinship level. In contrast to the Croatian subpopulations, for which significant levels of genetic differentiation were found in 93% of the analyzed pairs, the populations of Southeastern Europe showed a significant level in only 56% of population pairs. The lowest degree of genetic differentiation was present between the populations of Serbia and Romania (0.00013), and Serbia and Montenegro (0.00023). Significant differences between these populations were not found at any locus. On the other hand, the highest degree of genetic differentiation was found between the populations of Hungary and Albanians from Kosovo

Structure assessment of Croatian subpopulations
Population genetic structure at the individual level was additionally estimated with the Structure program (34). A sample of exclusively island populations was used, since a preliminary study including the mainland sample determined the most reliable value to be K = 1. In order to determine the most reliable K, the obtained results were processed with Structure Harvester. ΔK values were obtained, which provide a more accurate estimate of K. A strong signal for K = 6 is visible in Figure 4, suggesting that the previously defined 7 island populations were grouped into 6 separate populations. Based on the determined most reliable K value (K = 6), the optimal presentation was 6 genetically different founder populations. Krk and NDI grouped together, and were separated from other populations at K = 4 and above. Furthermore, the separation of all other populations is visible. Cres and Vis were already separated at K = 3, while Brač and Hvar were separated at K = 5. The population of the island of Korčula was also separated from all others at K = 5.

Structure assessment of Southeastern European populations
The posterior probability (lnP(D)) obtained using the Structure program was highest for K = 1 and gradually decreased for each subsequent K. According to the second criterion for determining the most reliable K (40), taking into account ΔK, a strong signal was evident for K = 3, which would indicate the division of the Southeastern European sample into three groups. Population structure analysis of Southeastern European populations is presented for K = 3 to K = 7 (Figure 5). Based on the determined most reliable K value (K = 3), the optimal presentation was three genetically different founder populations. Thus, at K = 3, the population of Albanians from Kosovo stands out, and, to a lesser extent, the population of Croatia, Bosnia and Herzegovina, and Hungary.
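The ΔK criterion can be sketched as the absolute second-order change in mean lnP(D) across successive K values, divided by the standard deviation of lnP(D) over replicate runs at that K. The code below is a minimal illustration of one commonly used form of the statistic, not a reimplementation of Structure Harvester. The function name and the lnP(D) values are hypothetical. Because the statistic needs both K - 1 and K + 1, it is undefined for the smallest and largest K tested, so it can never select K = 1.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Delta K for every K that has both neighbours.

    lnp_by_k: dict mapping K -> list of lnP(D) values from replicate runs.
    Returns a dict K -> Delta K (K values without neighbours are omitted).
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # undefined at the edges of the tested range
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / spread
    return result


# Hypothetical replicate lnP(D) values that decrease with K, as described in the text.
toy_lnp = {
    1: [-99.0, -101.0],
    2: [-100.0, -102.0],
    3: [-102.0, -104.0],
    4: [-109.0, -111.0],
    5: [-117.0, -119.0],
}
dk = delta_k(toy_lnp)
best_k = max(dk, key=dk.get)
```

In this toy data set lnP(D) is highest at K = 1, yet the sharpest change in slope occurs at K = 3, so the two criteria point to different values of K.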

Genetic structure analysis
The total genetic differentiation coefficient (FST) among all analyzed subpopulations of Croatia was low and amounted to 0.005, which means that only 0.5% of genetic diversity was influenced by differences between subpopulations. Since these geographically close populations share a common evolutionary history but have biological and socio-cultural specifics that have differently shaped their genetic structure, this study investigated how these populations related to each other and what the degree of their isolation was.
When looking at individual pairs of populations, 86% of population pairs were genetically different from each other as much as any two randomly selected European populations. Namely, among the largest European countries, the conservative upper limit of FST values was 1% according to the National Research Council (NRC) (41). This is considerably above the FST value of 0.0028, ie, 0.28%, obtained based on STR marker analysis in 11 different European countries (42). However, according to the NRC, the FST value for isolated populations was 0.03, ie, 3%. Taking into account the recommended limit for isolated populations, the FST values of all analyzed island population pairs in this study were much lower than stated, and therefore the limit of 0.01 (1%) would be more appropriate.
Out of a total of 28 population pairs, no difference was found at any of the analyzed loci for two Croatian population pairs (mainland-Brač and Krk-NDI). This finding might be explained by the fact that Brač and Krk are close to the mainland, with good transportation connections, and are therefore less isolated than the outer islands. The remaining 14% of population pairs showed a slightly higher level of differentiation. The highest degree of genetic differentiation was observed between Hvar and Vis (1.6%), which indicates pronounced genetic divergence between these two populations. The remaining three population pairs with FST values >1% were Cres and Vis, Korčula and Vis, and Cres and Hvar, which likewise indicates genetic divergence within these population pairs. On the other hand, allelic frequencies on the island of Brač and the mainland, and on Krk and NDI, which according to historical demographic data were founded by genetically similar ancestors, did not differ significantly. Previous studies of Eastern Adriatic islands observed a high degree of diversity (FST) among most of the investigated population pairs. The diversity within the islands, ie, their settlements (subpopulations), has, for example, been established on Hvar, Krk, Brač, and Korčula (2,15). The greatest genetic similarity in this study was observed between the populations of the mainland, Brač, Krk, and NDI, while Vis, as one of the remotest inhabited islands in the Adriatic, was the most different from all other studied subpopulations.
Figure caption (partial): ...different color), and the length of these segments is proportional to the estimated share in the statistically determined genetic group. Since the sample size disparity may affect the determination of population structure (40), the total number of samples for the Croatian population (n = 1230) was reduced (n = 158). Samples were randomly selected. *B&H - Bosnia and Herzegovina.
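The pair counts and percentages quoted in this section follow from simple combinatorics on the eight Croatian subpopulations named in the text. The short check below is only an arithmetic illustration; the variable names are arbitrary, and the counts of four pairs above 1% and two pairs without significant differences are taken from the text.

```python
from itertools import combinations

# The eight Croatian subpopulations named in the article.
populations = ["mainland", "Brač", "Krk", "NDI", "Hvar", "Korčula", "Cres", "Vis"]
pairs = list(combinations(populations, 2))

# Pairs reported with FST > 1% and pairs with no significant difference at any locus.
pairs_above_1pct = [("Hvar", "Vis"), ("Cres", "Vis"), ("Korčula", "Vis"), ("Cres", "Hvar")]
pairs_not_significant = [("mainland", "Brač"), ("Krk", "NDI")]

n_pairs = len(pairs)
pct_above = round(100 * len(pairs_above_1pct) / n_pairs)
pct_below = round(100 * (n_pairs - len(pairs_above_1pct)) / n_pairs)
pct_significant = round(100 * (n_pairs - len(pairs_not_significant)) / n_pairs)
```

Eight subpopulations give 28 pairs, so four pairs correspond to 14%, the other 24 pairs to 86%, and the 26 pairs with significant differentiation to 93%.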
Genetic differentiation of Southeastern European populations was also assessed with the FST index. Due to the reduced impact of endogamy at a higher level of population grouping, a lower coefficient of genetic diversity was found among the populations of Southeastern Europe, only 0.16%, which indicates a homogeneous distribution of alleles of the studied loci at the level of "general" populations of Southeastern Europe. Similar FST values (0.28%) were found among 11 European countries in an analysis based on microsatellite markers (42). The reason for this finding could be the socio-cultural origin of these populations in comparison with other studied European populations. Namely, they are the descendants of Illyrians, the autochthonous population of this area, which gradually intermixed with Romans, Slavs, and other more recent newcomers in the history of this area (43).

Structure of island subpopulations according to the Bayesian approach
The genetic population structure at the individual level was also estimated by using the Structure software (34). A preliminary analysis, which included all Croatian subpopulations, determined the most reliable value of K = 1, ie, did not find a structure within the sample. Similar results were presented previously by Martinović Klarić et al (9). Given that human populations generally show a low degree of genetic differentiation, this result was expected. Namely, inter-population differences in Europe are very low (FST = 0.7%) (39), despite the presence of specific populations such as the Basques or the population of Sardinia. Due to the established low level of genetic differentiation (FST) between defined (ancestral) populations of Croatia (from 0.001 to 0.016), the "LocPrior" model was used (36).
Furthermore, the mainland population was represented by a significantly larger number of participants (N = 497) than the island populations (N = 82-137). Since disproportionate sample sizes may affect the determination of population structure (39), we hypothesized that excluding the mainland population from the sample would make it easier to find structure among the islands. The results of the analysis of 9 STR loci conducted in this study showed that even such a small number of loci with high heterozygosity is sufficient to determine structural division in case of population differentiation. Namely, in small isolated populations allelic frequencies can become significantly different from those of the founding population in a very short period of time due to genetic drift (44). The populations of the island of Krk and NDI exhibited a small genetic distance and can be considered as one population. The most pronounced separation was shown for the island of Vis, followed by the island of Cres, which is consistent with their separation into separate clusters based on genetic distances.
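Why drift acts faster in small isolates can be illustrated with the classical expectation for an idealized Wright-Fisher population, in which the fixation index after t generations is 1 - (1 - 1/(2N))^t for effective size N. The sketch below assumes no migration, mutation, or selection, and the effective sizes and generation count are hypothetical values chosen only to show the contrast; they are not estimates for any population in this study.

```python
def expected_fixation_index(effective_size, generations):
    """Expected fixation index from drift alone in an idealized diploid population.

    Assumes a closed Wright-Fisher population of constant effective size,
    starting from F = 0, with no migration, mutation, or selection.
    """
    return 1.0 - (1.0 - 1.0 / (2.0 * effective_size)) ** generations


# Hypothetical comparison over the same number of generations.
small_isolate = expected_fixation_index(effective_size=50, generations=20)
large_population = expected_fixation_index(effective_size=5000, generations=20)
```

Under these assumptions the small population accumulates roughly two orders of magnitude more differentiation than the large one over the same period.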
The established and most reliable value of K = 6 shows that these seven pre-defined island populations are grouped into six predicted, genetically different founder populations. This speaks in favor of the continued isolation of the eastern Adriatic island communities and their mutual genetic diversity, and of the existence of weak but detectable sub-structuring at the Croatian population level.

Population structure of Southeastern Europe according to the Bayesian approach
The analysis showed greater homogeneity of the populations of this hierarchical group. Namely, while island subpopulations were separated into six predicted genetically different founder populations (K = 6), populations of Southeastern Europe split into only three different founder populations (K = 3). Due to the reduced influence of endogamy, lower genetic differentiation was found in Southeastern Europe than at the level of the Croatian subpopulations. To a lesser extent, the segregation of the populations of Croatia, Bosnia and Herzegovina, and Hungary was visible, which is in line with their grouping into a common cluster based on FST analysis. This finding is not surprising, since they are geographically neighboring populations.
The isolation of the Albanian population from Kosovo confirms the results of previous analyses (27), where the largest genetic distance was determined for this population, as well as its separation into a special cluster. Albanians are non-Slavic speakers in the Western Balkans region. They are believed to be descendants of Illyrians with different cultural, demographic, and linguistic history compared with the neighboring populations of Slavic origin. Despite their widespread migration all over the European continent, traditional social grouping of Albanians still remains strong, which may explain long-term genetic isolation (45,46).
This meta-analysis provides a systematic overview of the genetic sub-structuring in Croatia and in a wider Southeastern European context. It also highlights the importance of isolated island populations in the making of a population's genetic landscape. There are certain limitations to this study. A meta-analysis includes data from many different sources, which has certain disadvantages. Among others, the number of STRs included in the study had to be reduced in order to enable comparisons. However, even with a limited number of STRs (only nine), a sub-structure was detected.

Conclusion
The total genetic differentiation coefficient of Croatian subpopulations calculated by the FST method was higher at the level of the Croatian population (0.005) than at the level of Southeastern Europe (0.002). Consistent with this, the assessment of the genetic population structure defined six genetically different clusters for Croatia and three for the population of Southeastern Europe. In the population of Croatia, the subpopulation of the island of Vis showed the most pronounced separation, and in the population of Southeastern Europe the population of Albanians from Kosovo, followed by the populations of Croatia, Bosnia and Herzegovina, and Hungary. The established higher structure of Croatian subpopulations in relation to Southeastern Europe suggests the existence of a certain degree of genetic isolation, most likely due to the influence of endogamy within rural island populations.